Abstract

We present a method to simulate color, 3-dimensional images taken with a
space-based observatory by building off of the established shapelets pipeline.
The simulated galaxies exhibit complex morphologies, which are realistically
correlated between, and include, known redshifts. The simulations are created
using galaxies from the 4 optical and near-infrared bands (B, V, i and z) of the
Hubble Ultra Deep Field (UDF) as a basis set to model morphologies and redshift.
We include observational effects such as sky noise and pixelization and can add
astronomical signals of interest such as weak gravitational lensing. The realism
of the simulations is demonstrated by comparing their morphologies to the
original UDF galaxies and by comparing their distribution of ellipticities as a
function of redshift and magnitude to wider HST COSMOS data. These simulations
have already been useful for calibrating multicolor image analysis techniques
and for better optimizing the design of proposed space telescopes.

1 Introduction



As astronomical surveys become deeper and wider, analysis techniques corre- 
spondingly become more complex and demanding. To calibrate these methods, 
extensive work has already been invested in the simulation of monochromatic 
astronomical imaging. Simulation packages have been developed to incorporate
a semi-analytic model of galaxy number counts and evolution ([1]), or to
mimic the properties of real observations ([3]). However, there are currently no
packages able to create correlated images across several bands.

Multi-band image simulations are useful, first, for developing and calibrating
analysis methods that use multicolor data. Many measurements in astronomy (for
example photometry, astrometry and shape measurement) are "inverse prob- 
lems," where variation in a signal is easy to introduce but difficult to measure, 
usually due to complications involving observational seeing and noise. Sim- 
ulated data provide the best way to calibrate such methods because these 
variables can be controlled. A known astronomical signal can be inserted into 
simulated data, and the accuracy of a method can be judged by examining 
any errors in its recovery. 

One example is the measurement of weak gravitational lensing. In weak lens- 
ing, light from background galaxies is lensed by foreground matter distribu- 
tions, causing a shear (distortion) of the background galaxies' shapes. The 
distortion is easy to add during the construction of simulated data. Although 
the lensing signal is achromatic, color simulations can be used to test sophis- 
ticated measurement methods that take advantage of 

• the increased number of shear-measurable galaxies if some galaxies are only 
sufficiently bright in certain bands, 

• reduced noise on shear measurement (by $\sqrt{N}$) if the intrinsic shapes of
galaxies are uncorrelated between $N$ bands, and

• reduced systematic bias on shear measurement if the intrinsic shapes of
galaxies are correlated between bands ([2]).

One common challenge in weak lensing measurement is the deconvolution of 
galaxy shapes from the instrumental point-spread function (PSF). Since the 
PSF is different in each band, PSF-dependent biases will be averaged out by 
looking at multiple bands. Conversely, biases inherent to a method will not be 
ameliorated. Developing multicolor analysis techniques to exploit these tricks 
requires multicolor simulations. 

Multi-band image simulations are also useful to optimize the design and improve
the science case for planned multi-band imaging surveys such as SNAP ([4]) or
Euclid ([5]). These surveys require multiple bands in order to observe different
types of galaxies, observe objects typically obscured in other bands, observe
objects out to different redshifts, and, most importantly, obtain photometric
redshifts for galaxies. Engineering requirements for the design of
these instruments can be derived via image simulations by measuring the (of- 
ten complex and subtle) effects of engineering parameters on scientific return 
dsl). Predictions for the scientific return of a given mission can be similarly 
estimated (0;S). 

A full demonstration of the potential gains in multicolor shear measurement, 
or a full optimization of a future space-based lensing mission is beyond the 
scope of this paper. The purpose of this paper is to present a method for 
simulating deep, multi-color space-based images with correlated morphologies 
and redshifts. The simulation pipeline we present here will serve as a basis 
for performing the optimization of both shear measurement techniques and 
future space missions in future papers. 



Our simulation pipeline generalizes the single-color method of ([3]),
representing complex galaxy morphologies as "shapelets" ([9; 10]). Shapelet-based
simulations are already widely used for weak lensing. The Shear TEsting Program
(STEP) used similar simulated data to test and improve shape measurement and PSF
correction methods ([11]). Our generalization to multi-band, 3-dimensional
simulations thus increases the realism and utility of a well established
technique.

This paper is organized as follows. In §2 we give a brief review of shapelets 
and how they can be used to generate simulated images. In §3 we present the 
methodology by which we create multi-band, 3-dimensional simulations. In §4 
we test the realism of our simulations through comparison to the real HST 
data. Lastly, in §5, we discuss the conclusions and summarize our findings. 



2 Background 



Shapelets, or 2-dimensional Gaussian-weighted Laguerre polynomials, form a
complete, orthonormal basis able to represent any localized image, including a
galaxy shape, in a relatively small number of coefficients. Any image
$f(\mathbf{x})$ can be represented as a linear combination of shapelet basis
functions $\chi_{nm}(\mathbf{x}; \beta)$:

$$ f(\mathbf{x}) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} f_{nm}\, \chi_{nm}(\mathbf{x}; \beta), \qquad (1) $$

where $\mathbf{x}$ is the pixel position, $f_{nm}$ are the shapelet coefficients,
and $\beta$ is a characteristic size ([10]). Shapelets also simplify the
practical processes of image convolution and deconvolution. In real space,
convolution is an expensive process with computation time scaling with the
square of the number of pixels. With shapelets, convolution becomes a
computationally inexpensive matrix operation ([9]), and deconvolution merely
requires a matrix inversion ([12]). This is advantageous when it is necessary to
deconvolve with a PSF and re-convolve with a different PSF.
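
To make the basis concrete, the following minimal sketch evaluates the
one-dimensional Cartesian shapelets (Gauss-Hermite functions) and checks their
orthonormality numerically. This is an illustration under stated assumptions,
not the code of the Shapelets software package: the 2-dimensional Cartesian
basis is built from products of these functions, and the polar (Laguerre) form
used in this paper is a linear recombination of that basis. The function names
and integration settings are hypothetical.

```python
import math


def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via the three-term recurrence."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        # H_{k+1}(x) = 2 x H_k(x) - 2 k H_{k-1}(x)
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h


def shapelet_1d(n, x, beta):
    """1-D Cartesian shapelet B_n(x; beta), normalized to unit L2 norm in x."""
    u = x / beta
    norm = 1.0 / math.sqrt(2.0**n * math.sqrt(math.pi) * math.factorial(n) * beta)
    return norm * hermite(n, u) * math.exp(-0.5 * u * u)


def inner_product(n1, n2, beta, half_width=12.0, steps=4000):
    """Riemann-sum approximation of the overlap integral of B_n1 and B_n2."""
    dx = 2.0 * half_width * beta / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width * beta + i * dx
        total += shapelet_1d(n1, x, beta) * shapelet_1d(n2, x, beta) * dx
    return total
```

For example, `inner_product(3, 3, beta=2.0)` should be close to one and
`inner_product(2, 4, beta=2.0)` close to zero, which is the orthonormality
property that makes the coefficients $f_{nm}$ unique for a given $\beta$.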

Shapelet coefficients thus form a multi-dimensional parameter space that de- 
scribes a galaxy. In general, any possible galaxy morphology can be thought of 
as a point in this multi-dimensional parameter space. When the shapelet coeffi- 
cients of a set of observed galaxies are placed in this space, various correlations 
emerge. Different directions in parameter space correspond to characteristics 
of the galaxy such as size, ellipticity, or the number of spiral arms. A classic
example of this effect is the Hubble tuning-fork diagram ([13; 14; 15]), which
parametrizes galaxies by their ellipticity, bulge/disk ratio, and how tightly
wound the spiral arms are. The shapelets method increases the dimensionality of
the parametrization with axes corresponding to galaxies' magnitude, size
($\beta$), and polar shapelet coefficients ([3]).

Real galaxy morphologies only occupy a small region of this multi-dimensional 
parameter space. Most regions of parameter space, corresponding to random 
shapelet coefficients, do not produce an image that resembles a galaxy. To 
manufacture useful simulations, it is essential to map the region corresponding 
to morphologically realistic galaxies. This region will constitute a probability 
density function (PDF), from which we will be able to draw simulated galaxy 
images. To acquire the PDF, we begin with a sample of real galaxies. Of course, 
the PDF is only noisily sampled by this finite set of galaxies, so we smooth it 
to obtain an approximation to the true, underlying PDF. 



3 Methodology 



In this section we present the methodology used to create the simulations. 
The process can be summarized in two steps: (1) shapelet catalog creation, 
and (2) image creation.



3.1 Shapelet Catalog Creation 



Since our goal is to simulate multicolor images, it is necessary to start from 
real, multicolor data. The best data for this are the Hubble Ultra-Deep Field 
(UDF) images. In this field, there are 8049 galaxies with a detection signal 
to noise ratio of at least 10 in any one band. Photometric redshifts for each 



galaxy are publicly available in the Coe et al (l8j catalogo 

We use the program shex from the Shapelets software package^2 to decompose all
of these galaxies, in all observed bands, into a linear combination of shapelet
basis functions. We run shex up to a maximum radial oscillation value, or n_max
in the basis functions, of 20 in order to optimize decomposition. Although this
is computationally expensive (n_max = 20 corresponds to 231 coefficients), we
are assured that the large objects are well modeled. This algorithm
automatically copes with the varying pixel scale between optical and
near-infrared imaging. To maximize the efficiency of the shapelet model, we
iterate the center of each decomposition on the pixel grid, the maximum order
n_max, and the scale size $\beta$ of each decomposition independently in each
band, using the algorithm discussed in ([10]). This number is recorded for later
image reconstruction. We store catalogs of galaxy shapes in their raw form as
well as deconvolved from the UDF PSF as modeled by the stars in the field.

To model n-band imaging, we thus increase the dimensionality of the shapelet
parameter space n-fold. For example, while a bright object may be uniquely
described in one band by 233 coefficients (including magnitude, size and
redshift) if n_max = 20, it is now described by 932 coefficients. Though in the
UDF n = 4, our simulation software is not limited to this number and could
accept an input catalog of galaxies with more bands in the future. The UDF
galaxies in this highly dimensional parameter space automatically contain the
correlations between shapelet coefficients necessary to produce realistic galaxy
images.
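
The coefficient counts quoted above can be checked with a short sketch. It
assumes the usual polar shapelet indexing, in which $m$ runs from $-n$ to $n$ in
steps of 2; the function names are illustrative and do not come from the
Shapelets package.

```python
def polar_shapelet_indices(n_max):
    """List the (n, m) pairs with 0 <= n <= n_max, |m| <= n and n - |m| even."""
    return [(n, m) for n in range(n_max + 1) for m in range(-n, n + 1, 2)]


def n_coefficients(n_max):
    """Closed form for the number of shapelet coefficients up to order n_max."""
    return (n_max + 1) * (n_max + 2) // 2


N_MAX = 20
N_BANDS = 4      # B, V, i and z in the UDF
PER_BAND = 233   # per-band parameter count quoted in the text for n_max = 20

print(n_coefficients(N_MAX))   # shapelet coefficients per band
print(N_BANDS * PER_BAND)      # dimensionality of the 4-band parameter space
```

The same two functions give the size of the parameter space for any other
truncation order or number of bands.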

We then smooth the finite number of points in shapelet parameter space, using
an Epanechnikov kernel ([20]) with a different smoothing length, $\Delta_j$, for
each parameter. Note that, if we choose $\Delta_j$ to be too small, the galaxy
appears nearly unchanged, and we shall simply reproduce UDF galaxies in the
simulated images; if we choose it to be too large, the galaxy is not realistic.
Following the
established smoothing scheme explored by p), we smooth the complex polar 
shapelet coefficients in modulus and phase space, setting $\Delta_j = 15°$ for
phases and the mean separation between nearest neighbors in that dimension for
moduli. To perturb the galaxy redshifts slightly, we set the redshift smoothing
length to be $(1 + z)/m$, where $m$ is a free parameter. We choose to smooth
over $1 + z$ since it is used more frequently in determining cosmological
parameters. We also choose $m$ to be 6 as a reasonable limit to conservative
smoothing. If $m$ is chosen to be lower than 6, the high redshift objects could
get smoothed to an unrealistically high redshift.
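
The smoothing kernel and the smoothing lengths described above can be sketched
as follows. The kernel is the standard Epanechnikov form; the redshift smoothing
length is assumed here to be $(1 + z)/m$ with $m = 6$, and all function names
are hypothetical.

```python
def epanechnikov(u):
    """Epanechnikov kernel K(u) = 3/4 (1 - u^2) for |u| <= 1, zero elsewhere."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def redshift_smoothing_length(z, m=6.0):
    """Smoothing length in redshift, assumed to be (1 + z) / m."""
    return (1.0 + z) / m


PHASE_SMOOTHING_DEG = 15.0  # smoothing length quoted in the text for phases


def kernel_weight(delta, smoothing_length):
    """Probability density of a perturbation `delta` for a given smoothing length."""
    return epanechnikov(delta / smoothing_length) / smoothing_length
```

Because the kernel has compact support, a perturbed parameter never moves
further than one smoothing length from its original value. With $m = 6$, a
galaxy at $z = 5$ can therefore be moved by at most one unit in redshift.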



^1 Available for download at http://adcam.pha.jhu.edu/~coe/UDF/.

^2 Version 2.1β, available at http://www.astro.caltech.edu/~rjm/shapelets/.



3.2 Image Creation 



For each simulated image, we generate a sufficient number of new galaxies that
their density in the simulated image reproduces that in the UDF. In practice,
rather than pixelating and drawing from the smoothed PDF, we use an equivalent
Monte-Carlo bootstrap technique ([3]). For each new galaxy, an original UDF
galaxy is selected at random and perturbed in shapelet space, within the
smoothing kernel, to create a new galaxy. We also append a mock catalog of
photometric redshifts to these new galaxies. These redshifts are slightly
perturbed from the original galaxy's observed redshift via the same smoothing
process, while their distribution still follows the observed distribution in the
UDF.
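
A minimal sketch of such a bootstrap draw is given below. It is an
illustration only. The catalog layout (a list of `(coefficients, redshift)`
tuples), the function names, the rejection sampler, and the clipping of negative
moduli are all assumptions. A real implementation would also need to keep the
$m = 0$ coefficients real and preserve the conjugate symmetry between
$f_{n,m}$ and $f_{n,-m}$, which this sketch does not attempt.

```python
import cmath
import math
import random


def draw_epanechnikov(rng):
    """Rejection-sample u in [-1, 1] with density proportional to 1 - u^2."""
    while True:
        u = rng.uniform(-1.0, 1.0)
        if rng.random() <= 1.0 - u * u:
            return u


def perturb_galaxy(coeffs, z, modulus_lengths, rng, phase_length_deg=15.0, m=6.0):
    """Return a perturbed copy of one galaxy's complex coefficients and redshift."""
    new_coeffs = []
    for c, d_mod in zip(coeffs, modulus_lengths):
        # Perturb modulus and phase separately, each within its smoothing length.
        modulus = max(0.0, abs(c) + d_mod * draw_epanechnikov(rng))
        phase = cmath.phase(c) + math.radians(phase_length_deg) * draw_epanechnikov(rng)
        new_coeffs.append(cmath.rect(modulus, phase))
    # Redshift smoothing length assumed to be (1 + z) / m.
    new_z = z + (1.0 + z) / m * draw_epanechnikov(rng)
    return new_coeffs, new_z


def bootstrap_catalog(catalog, n_new, modulus_lengths, seed=0):
    """Draw n_new galaxies with replacement and perturb each one."""
    rng = random.Random(seed)
    return [perturb_galaxy(*rng.choice(catalog), modulus_lengths, rng)
            for _ in range(n_new)]
```

Drawing with replacement and perturbing within the kernel is equivalent to
sampling from the kernel-smoothed distribution, without ever having to tabulate
that distribution on a grid.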

At this stage, a known weak lensing shear signal can be added to the objects. 
Similar effects could also be added to simulate, e.g. proper motions, photo- 
metric variability, or supernovae. The galaxy is finally convolved in shapelet 
space with the desired point spread function. 
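
The effect of a known shear can be illustrated outside shapelet space. The
sketch below is not the shapelet-space operation used in the pipeline: as a
simpler stand-in, it applies the standard weak lensing distortion matrix (with
zero convergence) to points on a circular source and measures the resulting
ellipticity from unweighted second moments. All names are hypothetical.

```python
import math


def shear_matrix_inverse(gamma1, gamma2):
    """Inverse of the distortion matrix A = [[1 - g1, -g2], [-g2, 1 + g1]]."""
    det = 1.0 - gamma1**2 - gamma2**2
    return (((1.0 + gamma1) / det, gamma2 / det),
            (gamma2 / det, (1.0 - gamma1) / det))


def sheared_circle_ellipticity(gamma1, gamma2, n_points=360):
    """Shear a unit circle and measure (e1, e2) from its second moments."""
    (a, b), (c, d) = shear_matrix_inverse(gamma1, gamma2)
    qxx = qyy = qxy = 0.0
    for k in range(n_points):
        t = 2.0 * math.pi * k / n_points
        # Observed (lensed) position of a point on the circular source.
        x = a * math.cos(t) + b * math.sin(t)
        y = c * math.cos(t) + d * math.sin(t)
        qxx += x * x
        qyy += y * y
        qxy += x * y
    return (qxx - qyy) / (qxx + qyy), 2.0 * qxy / (qxx + qyy)
```

For a shear $\gamma$ applied to a circular object, this moment-based ellipticity
is $2\gamma / (1 + |\gamma|^2)$, so the inserted signal is known exactly and can
be compared with what a shear measurement method recovers.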

Once new objects are created, they are formed into a multi-band shape catalog
to be arranged into new images. The objects are first re-composed into pixelated
postage stamps, then placed into large, empty arrays. Each object is placed at
the same (RA, Dec) position in each band, in flux units of photons per second
per pixel, and with (for the sake of this paper) a constant pixel scale of 0.03
arcsec per pixel. Together with the mock photometric redshift catalog, a
3-dimensional, color simulation is thus created.

The images are made realistic by adding both a sky background and shot 
noise. Figure 1 shows an example image. The simulations could also be made 
more realistic by adding cosmic rays, variable sky background/read noise, 
or charge transfer inefficiency trailing. The background noise could also be
smoothed with a small kernel to approximate the effects of the DRIZZLE routine
on stacked data from multiple, dithered exposures ([22]). On the one hand,
DRIZZLE allows for a sharper pixel scale and correction for geometric
distortions. On the other hand it produces correlated pixel noise. We have not
enabled these options in the standard images studied in this paper, but intend
to explore their effects in future work.
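
The image assembly and noise steps can be sketched with NumPy as follows. This
is a toy illustration, not the pipeline's code: the array size, exposure time,
sky level, and the use of the same stamp in every band are placeholder
assumptions, and the function names are hypothetical.

```python
import numpy as np


def place_stamp(image, stamp, row, col):
    """Add a postage stamp to `image` with its first pixel at (row, col)."""
    h, w = stamp.shape
    # Assumes the stamp lies entirely inside the image.
    image[row:row + h, col:col + w] += stamp


def add_sky_and_shot_noise(rate, exposure_s, sky_rate, rng):
    """Return a noisy, sky-subtracted image in photons per second per pixel."""
    expected_counts = (rate + sky_rate) * exposure_s
    counts = rng.poisson(expected_counts)      # shot noise on source plus sky
    return counts / exposure_s - sky_rate


rng = np.random.default_rng(42)
bands = {name: np.zeros((64, 64)) for name in ("B", "V", "i", "z")}
stamp = np.full((5, 5), 0.2)                   # toy galaxy, photons/s/pixel
for image in bands.values():
    place_stamp(image, stamp, row=30, col=20)  # same position in every band
noisy = {name: add_sky_and_shot_noise(img, 1000.0, 0.05, rng)
         for name, img in bands.items()}
```

Working in count space for the Poisson draw and converting back to a rate keeps
the output in the same flux units as the noise-free image.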



4 Results 



We examine the morphological realism of our simulations by "blind-testing"
them against the original UDF. We also create a "$\delta$-function image,"
whereby the smoothing length ($\Delta_j$) for each coefficient is set to zero
(i.e., objects are not perturbed). This allows us to separate the morphological
characteristics due to shapeletization from those due to smoothing in shapelet
space; the results are given in Table 1. Our tests are similar to those done in
([3]). We first consider general comparisons using photometry and size from
SExtractor. We then examine more advanced morphological tests.

Fig. 1. Sample correlated image to UDF depth. The bands shown are B (top-left),
V (top-right), i (bottom-left) and z (bottom-right). The images are displayed on
the same logarithmic scale with the same contrast.

We initially test our perturbed simulations against the UDF with a
size-magnitude plot of FWHM vs. AB magnitude. The i-band plot is shown in
Figure 2 as a representative sample. The distribution should be the same for
the simulations as for the original UDF. We use SExtractor parameters
FWHM_IMAGE and MAG_BEST. We also test the simulations' realism through a
magnitude histogram, a histogram of ellipticity components $e_1$ and $e_2$, and
a histogram of FWHM. Since one can always increase the number of simulated
galaxies in an image, the histograms are normalized to the same area to better
observe the simulations' realism. We define the ellipticity
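components $e_1$ and $e_2$ to be

$$ \begin{pmatrix} e_1 \\ e_2 \end{pmatrix} = \frac{a^2 - b^2}{a^2 + b^2} \begin{pmatrix} \cos 2\theta \\ \sin 2\theta \end{pmatrix}, \qquad (2) $$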



where $a$, $b$, and $\theta$ are the SExtractor parameters A_IMAGE, B_IMAGE, and
THETA_IMAGE, namely the major and minor axes and the angle between the major
axis and the horizontal. This convention is generally adopted in weak lensing.
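
As a quick illustration, the ellipticity components can be computed from the
three SExtractor parameters as below. The amplitude $(a^2 - b^2)/(a^2 + b^2)$ is
an assumption here (a convention commonly used in weak lensing), THETA_IMAGE is
taken to be in degrees, and the function name is hypothetical.

```python
import math


def ellipticity(a_image, b_image, theta_image_deg):
    """Ellipticity components (e1, e2) from axis lengths and position angle."""
    # Assumed amplitude convention: (a^2 - b^2) / (a^2 + b^2).
    amplitude = (a_image**2 - b_image**2) / (a_image**2 + b_image**2)
    two_theta = 2.0 * math.radians(theta_image_deg)
    return amplitude * math.cos(two_theta), amplitude * math.sin(two_theta)
```

An object with `a_image=2`, `b_image=1` aligned with the horizontal has all of
its ellipticity in $e_1$; rotating it by 45 degrees moves that ellipticity into
$e_2$, and a round object has both components equal to zero.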

The plots discussed above reveal a strong morphological agreement between
the simulated images and the originals. Objects with AB magnitudes from 22 to 28
are well represented in all bands, up to a normalization factor. The ellipticity
histograms show a very strong representation of objects with $e_1$ and $e_2$
between ±0.8. Beyond this, objects are not as well represented on account of the
difficulty in representing highly elliptical objects as shapelets. Lastly, the
FWHM histogram also shows strong agreement across all sizes.

A more demanding test is provided by the morphological classification
parameters asymmetry (A), concentration (C), and clumpiness (S). These
morphology characteristics have been developed in ([23; 24; 25]). The CAS
parameters are defined in this work slightly differently than in ([25]). We
define the asymmetry, concentration, and clumpiness to be



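$$ A = \frac{\sum_{x,y} \left| I_{x,y} - I^{180}_{x,y} \right|}{\sum_{x,y} \left| I_{x,y} \right|}, \qquad (3) $$

$$ C = 5 \log_{10}\left( \frac{r_{80}}{r_{20}} \right), \qquad (4) $$

$$ S(\sigma) = 10 \times \frac{\sum_{x,y} \left( I_{x,y} - I^{\sigma}_{x,y} \right)}{\sum_{x,y} I_{x,y}}, \qquad (5) $$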



where $I_{x,y}$ is the flux intensity at a given pixel, $I^{180}_{x,y}$ is the
intensity at the point rotated 180° about the origin, $r_{80}$ and $r_{20}$ are
the radii containing 80% and 20% of the flux respectively, and
$I^{\sigma}_{x,y}$ is the intensity after the image is smoothed with a Gaussian
kernel of width $\sigma$. The definitions above do not include a correction for
the background as the more typical versions do. However, the resulting error
should average to zero when many galaxies are used, given that the noise
characteristics are the same for both the UDF and the simulated images. We use
a smoothing width $\sigma$ for $S$ of 5 pixels. We also use the Petrosian
Radius (R), defined to be the radius where the surface brightness at that radius
is equal to 20% of the surface brightness integrated within that radius ([26]).
Massey et al. (2004) demonstrated that the monochromatic shapelet image
simulations are consistent with real data by plotting A vs. C, A vs. S, and R
vs. C for the simulations and for the real data. Another test is to check the
mean and RMS values for A, C, and S for the simulations against the original
UDF. It was found that the simulations relative to the Hubble Deep Field (HDF)
demonstrated a roughly equal concentration while showing a lower asymmetry and
clumpiness ([10]). Though the objects in the original simulations from the HDF
have this discrepancy, it is relatively small, and it is concluded that the HDF
simulations are realistic ([3]). We demonstrate a similar recovery here.
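
For orientation, a minimal NumPy sketch of the asymmetry and concentration
measurements is given below. It assumes the common forms
$A = \sum |I - I^{180}| / \sum |I|$ and $C = 5 \log_{10}(r_{80}/r_{20})$, takes
the array centre as the origin, and omits background correction. Clumpiness is
left out because it depends on aperture choices that are not specified in the
text. The function names are hypothetical.

```python
import numpy as np


def asymmetry(img):
    """Sum |I - I_180| / sum |I|, rotating by 180 degrees about the array centre."""
    return np.abs(img - img[::-1, ::-1]).sum() / np.abs(img).sum()


def concentration(img):
    """5 log10(r80 / r20) from the curve of growth about the array centre."""
    ny, nx = img.shape
    y, x = np.indices(img.shape)
    r = np.hypot(y - (ny - 1) / 2.0, x - (nx - 1) / 2.0).ravel()
    order = np.argsort(r)
    # Cumulative fraction of the total flux enclosed within each pixel radius.
    growth = np.cumsum(img.ravel()[order]) / img.sum()
    r_sorted = r[order]
    r20 = r_sorted[np.searchsorted(growth, 0.2)]
    r80 = r_sorted[np.searchsorted(growth, 0.8)]
    return 5.0 * np.log10(r80 / r20)
```

For a circular Gaussian profile the expected concentration is
$5 \log_{10} \sqrt{\ln 0.2 / \ln 0.8} \approx 2.1$ and the asymmetry is zero,
which gives a simple sanity check before applying the functions to real postage
stamps.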

Examining the CAS plots, we see generally a strong agreement between the 
simulated images and the originals. One will notice the spread in Petrosian 
Radius for the simulated images. This can be explained by the shapeletization 
of objects. We have chosen to optimize the shapelet decomposition of objects 
to completely model the wings of galaxies, but at the expense of sometimes 
truncating their central cusps. This causes an increase in the Petrosian Radius, 
but should not present a large problem to methods such as weak lensing since 
the shearing (and then PSF smearing) happens later; this is just a small change 
in the intrinsic shape of the individual object which varies far more than the 
shear signal anyway. 

In computing our results, we rejected any major outliers in the simulated
images, for there were, on occasion, hugely asymmetric objects with
unrealistically high concentration and clumpiness indices. These objects were
very large galaxies that had not properly decomposed into shapelets. These
objects were flagged and not included in subsequent simulations. We also
rejected any asymmetry, clumpiness, or concentration measurements in any band
where a galaxy was so faint that the CAS routines failed to converge. We also
set the
limiting magnitude for these statistics to be 28 as computed by SExtractor 
as a balance between believable measurements and sufficiently deep galaxies. 
The results from the i-band simulated images are presented as a representative 
sample in Figures 2 through 4. A summary of results for all bands is presented 
in Table 1. The overall agreement is quite good with very little deviation from 
the original UDF. 

As we shall use these simulations for weak lensing, of particular interest is the
intrinsic ellipticity variance ($\sigma_\gamma$), as a function of magnitude and
redshift for shear-measurable galaxies. We include
$\sigma_\gamma(z, \mathrm{mag})$ in Figure 5. We calculate the shear of a galaxy
with the weak lensing measurement method of Rhodes, Refregier, and Groth
(hereafter RRG) ([27]). This is a mature method developed specifically for
space-based weak lensing measurements and thoroughly tested during analysis of
many Hubble Space Telescope images, including the Groth Strip ([28]), the Medium
Deep Survey ([29]), the STIS Parallel Survey ([30]), and the COSMOS 2 Square
Degree Survey ([31]). This method was also used for testing during the
development of the monochromatic shapelets image simulation pipeline described
in ([3]). We run the RRG pipeline on the simulated images exactly as we have run
it on real HST data and make similar cuts on objects in order to obtain the most
representative results possible.

We define a shear-measurable galaxy to be one that passes several cuts. We
first discard faint galaxies with S/N less than 10, where the S/N is defined as
the ratio of the SExtractor parameters FLUX_AUTO to FLUXERR_AUTO. We



also remove galaxies with ellipticity |e| > 2, after correction for the PSF (3^ 
Note that, in the presence of image noise, especially during the PSF correction 
stage, it is possible for a moment-based shape measurement method to produce 
a non-physical ellipticity |e| > 1. This ellipticity cut also implicitly removes 
galaxies for which an iterative centroiding process in RRG failed to converge. 
This includes objects for which there was a large shift away from the initial 
position detected by SExtractor. They are usually blended objects or close 
pairs, for which an accurate shape measurement would be impossible anyway. 
We finally discard objects with size
$d_{\mathrm{RRG}} = \sqrt{\tfrac{1}{2}(I_{xx} + I_{yy})}$ (where $I_{xx}$ and
$I_{yy}$ are the weighted second order moments) smaller than 1.2 times that of
the PSF. It is important that the cuts we make on the galaxies useful for
lensing be as realistic as possible. Given the long history of the RRG method in
space-based weak lensing measurements, we feel that running the RRG pipeline on
the simulated images is the best way to make these cuts realistic.
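
The three cuts can be summarized in a short sketch. This is not the RRG
pipeline. The `Galaxy` record, the function names, and the size definition
$d = \sqrt{(I_{xx} + I_{yy})/2}$ are assumptions made for illustration, and the
thresholds are the ones quoted in the text.

```python
import math
from dataclasses import dataclass


@dataclass
class Galaxy:
    flux_auto: float     # SExtractor FLUX_AUTO
    fluxerr_auto: float  # SExtractor FLUXERR_AUTO
    e1: float            # PSF-corrected ellipticity components
    e2: float
    ixx: float           # weighted second-order moments
    iyy: float


def rrg_size(ixx, iyy):
    """Size d = sqrt((Ixx + Iyy) / 2), assumed form of the size measure."""
    return math.sqrt(0.5 * (ixx + iyy))


def is_shear_measurable(gal, psf_size, min_snr=10.0, max_e=2.0, min_size_ratio=1.2):
    """Apply the S/N, ellipticity and size cuts described in the text."""
    snr = gal.flux_auto / gal.fluxerr_auto
    e_mod = math.hypot(gal.e1, gal.e2)
    size = rrg_size(gal.ixx, gal.iyy)
    return snr >= min_snr and e_mod <= max_e and size >= min_size_ratio * psf_size
```

Applying the same function to catalogs from the simulated and the real images
keeps the selection identical, so that differences in the resulting statistics
reflect the images rather than the cuts.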

In Fig. 5, we compare the RMS shear for our simulations with real data from
COSMOS ([35]). The close agreement within our sample error is indicative of
the simulations' realism. Running more simulations lowers the statistical error
bars, but we are limited by sample variance due to the limited number
of galaxies in the UDF. The large error bars seen in the figure reflect this
uncertainty.
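
A binned shear scatter of this kind can be computed as in the sketch below. The
function name, the half-open binning, the use of a single shear component, and
the choice of the sample standard deviation are illustrative assumptions.

```python
import statistics
from collections import defaultdict


def shear_scatter_by_bin(values, shears, bin_edges):
    """Standard deviation of the shear in each [edge_i, edge_i+1) bin."""
    binned = defaultdict(list)
    for v, g in zip(values, shears):
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= v < bin_edges[i + 1]:
                binned[i].append(g)
                break
    # Bins with fewer than two galaxies have no defined sample scatter.
    return {i: statistics.stdev(g) for i, g in binned.items() if len(g) > 1}
```

Here `values` can be either magnitudes or photometric redshifts, so the same
function serves both ways of binning. Sparsely populated bins, such as those at
the bright end, are dropped rather than assigned an unreliable scatter.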
Fig. 2. FWHM vs. MAG for i-band real UDF image (left) and simulated image (right).



5 Conclusions 



We have presented a method to create an arbitrary number of unique,
3-dimensional, color simulated deep space images. The simulations are created by
perturbing a galaxy's polar shapelet coefficients in such a way as to create
unique but realistic objects. The previous simulation pipeline has been expanded
to correlate morphologies across four wavelength bands and now also includes
a redshift distribution. Though we currently use four wavelength bands, our
simulation pipeline is flexible enough to include an arbitrary number of colors,
should such a data set become available and useful.

Fig. 3. Histograms of magnitudes (top-left), ellipticity e (top-right), FWHM
(bottom-left), and Petrosian Radius (bottom-right) for the i-band. Dashed lines
refer to the original UDF while the solid lines refer to the simulated images.

Fig. 5. The scatter in measured shears, $\sigma_\gamma$, in the i-band, as a
function of photometric redshift (left) and magnitude (right). $\sigma_\gamma$
is computed to be the standard deviation of the shear ($\gamma$), calculated in
each redshift/magnitude bin. Red points correspond to one UDF-sized simulated
image, while black crosses show measurements from the two square degree HST
COSMOS survey ([35]). The simulation points are actually obtained after
averaging many simulated images to reduce the error. However, since we are
drawing galaxies from a finite real population of UDF galaxies, the statistical
error on this measurement is quickly dominated by the scatter of these galaxies,
particularly at the bright end where their numbers are limited. We therefore
represent the error bars as standard deviations rather than errors on the mean.

Our simulations were tested by comparing them to the original UDF images. 
They were found to have similar morphologies to the original galaxies. Ad- 
ditionally, the weak lensing cosmological properties of the simulations were 
tested against COSMOS HST data. Within reasonable error, our simulations 
were found to be consistent. 
Table 1. Final results table comparing morphologies of the UDF with the
simulated images. The $\delta$-function images refer to simulated images created
without perturbing the shapelet coefficients ($\Delta_j = 0$). The perturbed
images were created by the smoothing method discussed in §3.2. For all the
images, the limiting AB magnitude was 28.